ESSV 2012: Continuous Speech Recognition Using Correlation Features and Structured SVM Probability Output
Authors
Abstract
One potential area for improvement in continuous speech recognition is the modelling of phoneme transitions (as distinct from transition probabilities), which arise from the non-stationarity of speech. Refined transition models can be used to compute probability distributions that serve as emission probabilities in HMM-based speech recognition systems. In this paper we present our approach to improving phoneme transition modelling. Building on our previous work, we partition each phoneme into start, middle, and end states (SME) and build a structure of support vector (SV) classifiers as our main discriminative method. For the phoneme classification step, cross-correlation features based on MFCC vectors are computed and classified within the SME structure. Additionally, we use a special reproducing kernel built upon the correlation features, which allows direct integration into the SV classifiers. This paper discusses the computation of the aforementioned probability outputs and presents initial results obtained by using these outputs as emission probabilities in HMMs representing phonemes, applied within a standard speech recognition system.
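As a rough illustration of the two ideas named in the abstract (the SME partition and correlation features computed from MFCC vectors), the following Python sketch may help. It is not the authors' implementation: the abstract does not say how a phoneme is split or which correlation is used, so the equal-thirds split, the zero-lag normalized correlation between neighbouring frames, and the function names sme_partition, cross_correlation, and correlation_features are all assumptions made for this example.

```python
import numpy as np


def sme_partition(frames):
    """Split a phoneme segment into start, middle, and end parts.

    frames: array of shape (n_frames, n_coeffs), one MFCC vector per row.
    Assumption: roughly equal thirds; the abstract does not specify the split.
    """
    return np.array_split(frames, 3, axis=0)


def cross_correlation(a, b):
    """Zero-lag normalized cross-correlation of two MFCC vectors.

    Returns a value in [-1, 1], or 0.0 if either centered vector is all zeros.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def correlation_features(frames):
    """Correlation of each frame with its successor (hypothetical feature choice)."""
    return np.array(
        [cross_correlation(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
    )
```

In such a setup, each of the three parts returned by sme_partition would be passed to correlation_features, and the resulting vectors would be classified by the SV classifier for the corresponding state.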
Similar resources
Improving the performance of a continuous speech recognition system using features extracted from speech manifolds in the reconstructed phase space
Designing new methods for extracting features from the speech signal and combining the information they provide is one of the most effective approaches to improving the performance of automatic speech recognition (ASR) systems. Recent research has shown that the speech signal has nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...
Hybrid MLP/structured-SVM tandem systems for large vocabulary and robust ASR
Tandem systems based on multi-layer perceptrons (MLPs) have improved the performance of automatic speech recognition systems on both large vocabulary and noisy tasks. One potential problem of the standard Tandem approach, however, is that the MLPs generally used do not model temporal dynamics inherent in speech. In this work, we propose a hybrid MLP/Structured-SVM model, in which the parameters...
Object Recognition based on Local Steering Kernel and SVM
The proposed method recognizes objects by applying Local Steering Kernels (LSK) as descriptors to image patches. To represent the local properties of an image, patches are extracted where variations occur in the image. A wavelet-based salient point detector is used to find the interest points. The Local Steering Kernel is then applied to the resultant pixels, in ...
Using output probability distribution for improving speech recognition in adverse environment
This paper proposes a method to improve the accuracy of small-vocabulary, isolated-word, speaker-independent speech recognition in adverse environments. The approach is implemented using Output Probability Distributions (OPDs) and Support Vector Machines (SVMs). OPDs improve system performance by modeling inter-word relationships; SVM classifiers are then used to discriminate the dif...
Calibrated Structured Prediction
In user-facing applications, displaying calibrated confidence measures (probabilities that correspond to true frequency) can be as important as obtaining high accuracy. We are interested in calibration for structured prediction problems such as speech recognition, optical character recognition, and medical diagnosis. Structured prediction presents new challenges for calibration: the output space...